How COVID-19 affects voting for incumbents: Evidence from local elections in France

How do voters react to an ongoing natural threat? Do voters sanction or reward incumbents even when incumbents cannot be held accountable because an unforeseeable natural disaster is unfolding? We address this question by investigating voters’ reactions to the early spread of COVID-19 in the 2020 French municipal elections. Using a novel, fine-grained measure of the circulation of the virus based on excess-mortality data, we find that support for incumbents increased in areas that were particularly hard hit by the virus. Incumbents from both left and right gained votes in areas more strongly affected by COVID-19. We provide suggestive evidence for two mechanisms that can explain our findings: an emotional channel related to feelings of fear and anxiety, and a prospective-voting channel, related to the ability of incumbents to act more swiftly against the diffusion of the virus than challengers.


C.2 Population extrapolation
Population data at the municipality level is only available with a 3-year lag.¹ To obtain population data for 2018, 2019 and 2020, we extrapolated the population from the previous years by estimating the municipality-specific trend over the years 2010-2017. We added up the populations of the municipalities that merged in 2020 but were distinct at some point in the past.
We consider three basic extrapolation models:
• constant population
• linear increase with time
• linear increase with time on the log scale
To predict the population in 2019 and 2020 for each municipality, we aim to use the best model according to a parsimony criterion and a goodness-of-fit criterion. More precisely, this can be formalised in four models (Equations (2) to (5)), where $N_{mt}$ denotes the population of municipality m in year t. For our parsimony criterion, we deem it reasonable to rank the models by increasing complexity in the following order: 1) constant population (Equation (2)), 2) constant population on the log scale (Equation (3)), 3) linear trend (Equation (4)), 4) linear trend on the log scale (Equation (5)).
For each model and each municipality, our goodness-of-fit criterion is the Akaike Information Criterion (AIC, [SI9]). A difference in AIC of 3 is sometimes considered reasonable evidence in favor of one of the two models ([SI10]). When the difference in AIC between the best model according to the AIC and a simpler model according to the complexity ranking is smaller than 3, we prefer the simpler model. Our mixed-criteria model selection procedure therefore consists of choosing the simplest model whose AIC lies within 3 of the best AIC. We also considered the Bayesian Information Criterion [SI9] instead of the AIC and obtained similar results.
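The selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes Gaussian errors with a maximum-likelihood variance for all four candidate models, and it subtracts the log-Jacobian term so that the AIC of a log-scale fit is comparable with that of a raw-scale fit. The function names (`gaussian_aic`, `fit_residuals`, `select_model`) and the example data are hypothetical.

```python
import math

def gaussian_aic(residuals, n_mean_params, log_jacobian=0.0):
    """AIC of a Gaussian model with ML variance (assumed error model)."""
    n = len(residuals)
    sigma2 = max(sum(r * r for r in residuals) / n, 1e-12)  # guard against a perfect fit
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1) - log_jacobian
    k = n_mean_params + 1  # mean parameters plus the variance
    return 2 * k - 2 * loglik

def fit_residuals(years, values, linear):
    """Least-squares residuals for a constant or a linear-in-time mean."""
    n = len(years)
    t_bar = sum(years) / n
    y_bar = sum(values) / n
    slope = 0.0
    if linear:
        sxx = sum((t - t_bar) ** 2 for t in years)
        slope = sum((t - t_bar) * (y - y_bar) for t, y in zip(years, values)) / sxx
    return [y - (y_bar + slope * (t - t_bar)) for t, y in zip(years, values)]

# Candidate models, ordered by increasing complexity as in the text:
# (name, linear trend?, fitted on the log scale?)
MODELS = [
    ("constant", False, False),
    ("constant_log", False, True),
    ("linear", True, False),
    ("linear_log", True, True),
]

def select_model(years, population, radius=3.0):
    """Return the simplest model whose AIC is within `radius` of the best AIC."""
    aics = []
    for name, linear, on_log in MODELS:
        values = [math.log(p) for p in population] if on_log else list(population)
        # Change of variables: density of N equals density of log N divided by N.
        jacobian = sum(math.log(p) for p in population) if on_log else 0.0
        residuals = fit_residuals(years, values, linear)
        aics.append((name, gaussian_aic(residuals, 2 if linear else 1, jacobian)))
    best = min(aic for _, aic in aics)
    for name, aic in aics:  # first hit is the simplest admissible model
        if aic - best < radius:
            return name, dict(aics)

years = list(range(2010, 2018))
stable = [1000, 1002, 999, 1001, 998, 1000, 1001, 999]       # hypothetical data
growing = [1000, 1051, 1099, 1150, 1202, 1249, 1300, 1351]   # hypothetical data
print(select_model(years, stable)[0])
print(select_model(years, growing)[0])
```

Because the candidates are scanned in order of complexity, a more complex model is only retained when it improves the AIC by at least 3 over every simpler candidate.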

Excess mortality models
We model excess mortality in 2020 compared to the previous years through a Poisson model with municipality-specific intensities. We first introduce a basic model for excess mortality, which does not take into account the age of the deceased. The purpose of the basic model is to show the added value of the main model, which accounts for the dependence on age and sex of the probability of dying once infected (Infection Fatality Ratio, IFR).
For the basic model, let $Y_{mt}$ denote the number of deaths between March 15th and six weeks later in municipality m in year t, and let $N_{mt}$ denote the population of that municipality in year t. We make the common assumption (see for instance [SI12,14]) that mortality follows a Poisson distribution with a baseline municipality-specific hazard $h_m$ for 2015 ≤ t ≤ 2019, plus a municipality-specific excess hazard $h_m^{+}$ for t = 2020 (Equation (7)). In this model, $h_m^{+}$ characterises the severity of the COVID-19 outbreak in the municipality. A natural consequence of this model is that the excess probability of dying from COVID-19 is the same for every person in any given municipality, which we know to be an oversimplification of reality.
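One way to write the basic model described above is sketched below. This is a reconstruction from the verbal description, not the authors' typeset equation: it assumes that the Poisson intensity is the population multiplied by the hazard, and the indicator notation is introduced here for compactness.

```latex
Y_{mt} \sim \mathrm{Poisson}\!\left( N_{mt} \, \bigl( h_m + h_m^{+} \, \mathbb{1}[t = 2020] \bigr) \right),
\qquad 2015 \le t \le 2020 .
```

Under this form, the expected number of deaths in 2015-2019 is $N_{mt} h_m$, and the expected number of additional deaths in 2020 is $N_{mt} h_m^{+}$.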
We therefore introduce a model accounting for the differential probability of dying from COVID-19 infection depending on age and sex, which has been estimated using data from various countries in [14].
Starting from the mortality records, we attribute each death record to a 5-year age class matching the classes defined by [14]. We therefore use the best age resolution available for the age-specific Infection Fatality Ratio.
We discard death records for children under 10 and for people over 80 because excess deaths in these age classes can be linked to a variety of causes and are not very informative about the level of COVID-19 spread. Let $Y_{mtas}$ denote the number of deaths between March 15th and six weeks later in municipality m in year t in age class a for sex s, and let $N_{mtas}$ denote the population in that age and sex class of that municipality in year t. As in the basic model, we assume that mortality follows a Poisson distribution with a baseline municipality-, age- and sex-specific hazard $h_{mas}$ for 2015 ≤ t ≤ 2019, plus a municipality-specific excess hazard $h_{mas}^{+}$ for t = 2020, where the excess hazard involves the COVID-19 prevalence in the municipality $\nu_m$ and the age- and sex-specific Infection Fatality Ratio $\rho_{as}$.
We define a continuous age-class variable $x_a$ indicating the index of the age class (a binning of age, and thus roughly proportional to age), which we scale and center. We further define a binary sex variable $x_s$, where the value 1 corresponds to males. These quantities are combined in the main model (Equation (8)), in which $h_m$ is the baseline hazard in municipality m for females in the central age class, and $\beta_{\text{age}}$ and $\beta_{\text{sex}}$ are the country-wide age and sex effects on the baseline mortality.
Note that in this model the prevalence $\nu_m$ is a municipality-level quantity, which is the same for all age and sex classes. We use $\nu_m$ to characterize the severity of the COVID-19 outbreak in municipality m.
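A form of the main model consistent with the description above is sketched below. It is a reconstruction under two assumptions that the text does not state explicitly: that the excess hazard is the product of the prevalence and the Infection Fatality Ratio, and that the age and sex effects act log-linearly on the baseline hazard. The authors' typeset equation may differ in its link function or parametrisation.

```latex
\begin{aligned}
Y_{mtas} &\sim \mathrm{Poisson}\!\left( N_{mtas} \, \bigl( h_{mas} + h_{mas}^{+} \, \mathbb{1}[t = 2020] \bigr) \right), \\
h_{mas}^{+} &= \nu_m \, \rho_{as}, \\
\log h_{mas} &= \log h_m + \beta_{\text{age}} \, x_a + \beta_{\text{sex}} \, x_s .
\end{aligned}
```

With $x_a$ centred and $x_s = 0$ for females, the last line reduces to $h_{mas} = h_m$ for females in the central age class, which matches the interpretation of $h_m$ given in the text.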

Prior structure
A striking feature of the excess hazard due to COVID-19 in 2020 is that, in the period considered, only a few hotspots of COVID-19 infection were scattered around the country, which implies that most municipalities were not yet affected. As a consequence, it is realistic for most $h_m^{+}$ and $\nu_m$ values to be negligible.
We include this insight in the model by performing inference in the Bayesian framework, specifying a sparsity-inducing Lasso prior [SI13] on $h_m^{+}$ and $\nu_m$, and resorting to Maximum a Posteriori (MAP) estimation. In the prior structure for the first model, which does not account for the age- and sex-specific IFR, $\mathcal{N}$ denotes the Normal distribution, $\mathcal{N}^{+}$ denotes the truncated Normal distribution on the positive real numbers and $\log\mathcal{N}^{+}$ denotes the lognormal distribution.
The prior structure for the second model, which accounts for the age- and sex-specific IFR, includes $\lambda \sim \mathrm{Gamma}(0.001, 0.001)$. We use very vague priors on the location and scale parameters for the baseline mortality. For $h_m^{+}$ and $\nu_m$ we chose a prior on $\mathbb{R}^{+}$ with positive density at 0 to allow for shrinkage of the MAP to 0. Rather than using cross-validation to select the optimal penalty parameter, we follow the recommendations of [SI13] and [SI14] and use a Gamma distribution for the Lasso parameter λ, although we place this prior on λ rather than λ² since we do not have conjugacy constraints. The prior structure is essentially the same for both models, with, for the second model, an additional weakly informative prior on the age and sex effects on the baseline mortality. Since the sex covariate is binary and the age covariate is scaled and centred, the priors allow everything from negligible to large positive or negative effects.
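The way such a prior shrinks the MAP estimate exactly to 0 can be seen in a deliberately simplified setting, sketched below in Python. The sketch makes assumptions that go beyond the text: the Lasso prior restricted to the positive reals is taken to be an exponential distribution with rate λ, the baseline hazard is treated as known, and λ is fixed. Under these assumptions the log-posterior of the excess hazard is concave and its maximiser has a closed form. The function name and the numbers are hypothetical, and the authors' joint model is not solved this way.

```python
def map_excess_hazard(deaths_2020, population, baseline_hazard, lam):
    """MAP of h+ when Y ~ Poisson(N * (h + h+)), h+ ~ Exponential(lam), h known.

    The log-posterior in h+ is
        Y * log(h + h+) - (N + lam) * h+ + const,
    which is concave; its unconstrained maximiser is Y / (N + lam) - h,
    and the constraint h+ >= 0 clips negative values to exactly 0.
    """
    return max(0.0, deaths_2020 / (population + lam) - baseline_hazard)

# Hypothetical municipality: 10 000 inhabitants, baseline hazard of 1 per 1000.
quiet = map_excess_hazard(deaths_2020=10, population=10_000,
                          baseline_hazard=0.001, lam=500.0)
hotspot = map_excess_hazard(deaths_2020=40, population=10_000,
                            baseline_hazard=0.001, lam=500.0)
print(quiet, hotspot)
```

In this simplified setting the penalty parameter enters the denominator next to the population, so a larger λ requires a larger excess in the death count before the estimate moves away from 0.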

Inference
Given the large number of municipalities, the previous models have a large number of parameters, on the order of 60 000. This renders standard Markov chain Monte Carlo (MCMC) inference very time-consuming.
Since the goal of this analysis is to obtain point estimates for the municipality-level excess hazard or the municipality-level prevalence, we resort to Variational Inference, which is known to perform well in obtaining point estimates from a posterior distribution [SI15]. We use a stochastic variational inference algorithm [SI16] through the interface available in Stan [SI17]. Sparsity is obtained by calculating MAP values conditional on the optimal value of the penalty parameter λ, estimated by the median of the marginal posterior distribution of λ. Optimisation of the posterior to obtain the MAP is performed using the default Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm implementation in Stan.

C.4 Estimated values and validation
The model-based excess hazard (Equation (7)) and the prevalence (Equation (8)) are more robust than the empirical excess hazard because they take into account the size of the population in the Poisson intensity, and because they benefit from the country-wide shrinkage (Equations (9) and (10)) for the baseline mortality estimates and from the Lasso penalty.
This increased robustness is expected to play a role at the municipality level, where populations may be small and fluctuations large. At the regional level, with larger aggregated populations, the model-based and empirical estimates of excess hazard should be close.

Validation by comparison with external measures
We compare our measure of the spread of COVID-19 (prevalence) with other measures obtained from publicly available data. These measures are rarely available at the municipality level and may require aggregating prevalence at the county or regional level. First, we use county-level data from hospitals, such as the number of hospitalised people, the number of people in intensive care or the number of deceased per inhabitant.³ These measures should be relatively noise-free, assuming that the quality of hospital reporting is high, but they are still partial, as they document only the gravest cases of COVID-19 infection. These data contain information closely related to excess death, and we observe a correlation with our prevalence measure (Fig 10).
Second, we retrieve data from the CoviPrev survey,⁴ which has tracked the evolution of mental health and general behaviour, such as the adoption of preventative measures, in the French population since March 23, 2020. These data are recorded at the regional level. Since our measure of prevalence can be imprecise for municipalities with a small population,

Note: The mean predictive estimate for the number of deaths in each age and sex category, for each municipality and each year, is plotted against the observed number of deaths. The prediction is performed using the MAP estimates for the parameters. The vertical segments denote the 95% prediction interval; they have a light colour if they cover the observed data and a dark colour if they do not. The identity line is denoted in red.

Fig S1. Testing mechanisms (alternative specification of COVID-19 spread)
Fig S2. DAG justifying the choice of control variables
Fig S3. Testing the parallel trend assumption
Fig S4. Weekly death count in France between 2001 and 2020
Fig S5. Number of recorded deaths in each year
Fig S6. Random selection of municipalities with a population larger than 1000
Fig S7. Worst fitting municipalities among the 34972 municipalities
Fig S8. Three measures of COVID-19 severity aggregated at the regional level
Fig S9. Posterior predictive check for the prevalence model (Equation (8))
Fig S10. Correlation between the COVID-19 outbreak severity measure and various measures from the departmental-level hospital dataset
Fig S11. Correlation between the COVID-19 outbreak severity measure and the level of anxiety reported during the first week after the election. Every measure is at the regional level
Fig S12. Correlation between the COVID-19 outbreak severity measure and the log distance between each municipality and the nearest COVID-19 hotspot

Fig S5. Number of recorded deaths in each year, aggregated at the country level and during the period of interest.

Fig S6. Random selection of municipalities with a population larger than 1000. The dark points indicate observed data, the light points indicate extrapolated data. The panel label denotes the INSEE code for the municipality considered.
times (84%). Fig 6 shows a random sample of the municipalities, with the data used for fitting the extrapolation model and the extrapolation. In all cases, the extrapolation seems reasonable. Fig 7

Fig S7. Worst fitting municipalities among the 34972 municipalities. The dark points indicate observed data, the light points indicate extrapolated data. The panel label denotes the INSEE code for the municipality considered.
For the exclusion of children under 10 and people over 80 from the death records, see notably the dispersion of the Infection Fatality Ratio for young children and people over 80 in Figure 1.b of [14].

Fig 8 compares the two estimations of COVID-19 outbreak severity to a naive estimate, the empirical excess hazard, aggregated at the regional level. The empirical excess hazard is obtained by dividing the number of deaths by the population size for each year, then subtracting the mean hazard for the previous years from the empirical hazard for 2020. This empirical estimate does not take into account population size and is thus sensitive to outliers. The model-based excess hazard (Equation (7)) and the prevalence (Equation (8)) are more robust.
Fig 11 shows a strong correlation between the spread of COVID-19 and anxiety at the regional level, lending support to our measure of the level of threat perceived by the voters. Third, we use data compiled by [55], who estimate the spread of COVID-19 using the distance to the COVID-19 hotspots known to be active at the time of the election. This measure can be compared directly with our measure since it is available at the municipality level. As expected, our measure of prevalence is negatively correlated with the log-distance of the municipalities to the nearest COVID-19 hotspot (Fig 12).
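The empirical excess hazard used as the naive benchmark in this section can be written in a few lines of Python. This is a minimal sketch of the verbal definition (deaths divided by population for each year, then the 2020 hazard minus the mean hazard of the previous years). The function name, the dictionary layout and the numbers are hypothetical.

```python
def empirical_excess_hazard(deaths, population, target_year=2020):
    """Hazard in `target_year` minus the mean hazard over all other years.

    `deaths` and `population` map each year to a count for one area
    (a municipality, or a region after summing over its municipalities).
    """
    hazards = {year: deaths[year] / population[year] for year in deaths}
    baseline_years = [year for year in hazards if year != target_year]
    baseline = sum(hazards[year] for year in baseline_years) / len(baseline_years)
    return hazards[target_year] - baseline

# Hypothetical area: 10 deaths per year in 2015-2019, 25 deaths in 2020.
deaths = {2015: 10, 2016: 10, 2017: 10, 2018: 10, 2019: 10, 2020: 25}
population = {year: 10_000 for year in deaths}
excess = empirical_excess_hazard(deaths, population)
print(excess)
```

Every area contributes one number regardless of its size, so a single additional death in a very small municipality moves this estimate much more than in a large one. This is the sensitivity that the model-based measures are meant to reduce.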

Table S6. First-difference regression models

Table S14. Testing mechanisms with interaction models

Table S1. Predictors of city with incumbent (vs. no incumbent)
Note: Summary statistics at the municipality level. Only municipalities with more than 1000 inhabitants and an incumbent candidate in 2020.

Table S3. Political affiliation of incumbents

Table S4. Effects of COVID-19 on voting for incumbents

Table S8. Interactions with political affiliation of incumbents

Table S9. Propensity score matching
Note: OLS regression coefficients with standard errors in parentheses. Sample of matched municipalities. The models include the same covariates included in Table S4. * p<.05, ** p<.01, *** p<.001

Table S10. Placebo tests: Effect of COVID-19 on voting for incumbents in 2014
Note: OLS regression coefficients with standard errors in parentheses. Dependent variable: vote share for incumbents in 2014 (first round). The models include the same covariates included in Table S4 (apart from turnout in 2014 and number of candidates in 2020). * p<.05, ** p<.01, *** p<.001

Table S12. DiD in regression framework
Note: Regression coefficients with bootstrapped standard errors in parentheses. Treatment defined as being in the third or fourth quartile of municipalities affected by COVID-19. * p<.05, ** p<.01, *** p<.001

Table S13. Testing the parallel trend assumption
Note: Regression coefficients with standard errors clustered at the municipality level in parentheses. Sub-sample of municipalities with same mayor in 2008, 2014 and 2020 (N=204). High COVID-19 spread defined as being in the third or fourth quartile of municipalities affected by COVID-19. * p<.05

The model-based estimate of excess hazard and the empirical estimate are indeed close in Fig 8, where we see a large excess hazard in the Grand-Est region and the Île-de-France region, as expected. Since the prevalence measure takes into account the age structure of the population, it can provide different values depending on the age at death of the people on the mortality records. Indeed, we see that our assessment of the severity of the COVID-19 outbreak is fairly different from that of the basic excess hazard model (Equation (7)), with the Île-de-France region now much more severely affected than the other regions. Because of the comparatively older population in the Grand-Est region, a COVID-19 outbreak comparable to other regions caused a large excess mortality. Conversely, the prevalence measure reveals that although excess hazard was not the highest in the Île-de-France region, the COVID-19 outbreak was particularly severe, given the markedly younger population in counties such as Seine-Saint-Denis.

Posterior predictive model checking
We perform posterior predictive checks for our main model (Equation (8)), where we predict the number of deaths for each municipality based on our MAP estimates and compare them with the observed number of deaths. Fig 9 shows that predictions based on the MAP estimates generally perform well. The 95% prediction intervals mostly cover the true value, with an average coverage of 99%, which suggests that the data is slightly under-dispersed compared to a Poisson error model.
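The coverage figure reported for the posterior predictive check can be computed along the following lines. This Python sketch assumes equal-tailed Poisson prediction intervals built from the predicted mean of each cell, which may differ from the authors' exact construction. All names and numbers are hypothetical, and the simple quantile routine is only suitable for the small expected counts typical of a municipality over a six-week window.

```python
import math

def poisson_quantile(mu, q):
    """Smallest k such that P(Y <= k) >= q for Y ~ Poisson(mu)."""
    k = 0
    pmf = math.exp(-mu)   # P(Y = 0); underflows for very large mu
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mu / k     # recurrence P(Y = k) = P(Y = k - 1) * mu / k
        cdf += pmf
    return k

def prediction_interval(mu, level=0.95):
    """Equal-tailed prediction interval for a Poisson count with mean mu."""
    alpha = 1.0 - level
    return poisson_quantile(mu, alpha / 2), poisson_quantile(mu, 1 - alpha / 2)

def coverage(observed, predicted_means, level=0.95):
    """Share of observed counts that fall inside their prediction interval."""
    hits = 0
    for y, mu in zip(observed, predicted_means):
        low, high = prediction_interval(mu, level)
        hits += low <= y <= high
    return hits / len(observed)

# Hypothetical cells: observed deaths and MAP-based predicted means.
observed = [0, 1, 4]
predicted = [1.0, 1.0, 1.0]
print(prediction_interval(1.0), coverage(observed, predicted))
```

Each cell (municipality, year, age class, sex) contributes one comparison, and the reported coverage is the share of cells whose observed count lies inside its interval.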